Data processing circuit with multiplexed memory 18.06.2003



The invention relates to an apparatus in which processes executed [...]
access the same single port memory circuit.



[...] realize a pseudo multi-port memory.

[...] execute a write process and a second circuit may execute a read process [...]
reading speed of data out of the [...] buffer respectively.

A high memory speed is disadvantageous. It results in high power
consumption and it imposes limits on the operating speed of the apparatus.



Among others, it is an object of the invention to provide for an apparatus in which
a lower memory access speed can be used.

Among others, it is a further object of the invention that the circuits that
access the memory can operate each under control of their own substantially periodic clock
signal, without suspending execution for a clock cycle to wait during memory access by other
circuits.

Among others, it is a further object of the invention to reduce the required 
access speed when at least two of the circuits that access the memory have mutually different 
clock periods. 

Among others, it is another object of the invention to provide for an apparatus 
with a single port memory and at least two circuits that access the memory independently in 
which no multiposition FIFO queue is used. 

The apparatus according to the invention is set forth in Claim 1. In the 
apparatus a timing circuit realizes a variable phase delay between the periodic start times of 
validity intervals in which a first processing circuit outputs access requests (containing e.g. 
memory addresses) and acceptance of the access requests in the validity interval (accepting a 
memory access request as used herein only means that the memory circuit starts handling the 
request in a way that the request need no longer be maintained). Handling an access request 
from a second processing circuit increases the phase delay before a next request can be 
accepted. However, such an access request from the second processing circuit is handled only 
when the resulting increased phase delay for the first data processing circuit remains within 
the validity interval. In subsequent validity periods the phase delay is reduced in successive 
steps, until the delay in a particular period of validity lies at least one minimum memory
repetition period before the end of the particular validity interval. By permitting a variable 
phase delay the speed requirements imposed on the memory access speed are reduced. 

Because the phase delay remains within the validity interval of the requests 
the requests from the first data processing circuit can always be captured within the validity 
interval during which they are output by the first data processing circuit. There is no need to
make the first data processing circuit pause for an operation cycle to wait for acceptance of a
request. Accordingly, in an embodiment a single register is used to capture request
information, without using a FIFO for buffering a number of requests that may increase to
two or more requests. The register may even be shared for successively storing requests from
both data processing circuits, since requests from the first data processing circuit always
remain valid until after requests from the second data processing circuit can be discarded.

In an embodiment the timing circuit comprises respective clocking circuits for
periodically clocking operation of the first and second processing circuits, so that a sum of
the frequencies at which new access requests are made available (or can be made available
except for processing dependent reasons) is less than the inverse of the minimum memory
repetition period. Thus, it is ensured that all access requests from periodically clocked
processing circuits are handled and that the second processing circuit only makes access
requests when they can be handled sufficiently early, so that handling finishes before the end
of a clock period of the first data processing circuit in which a subsequent access request has
been made.

In an embodiment the timing circuit comprises an asynchronous arbitration
circuit. Each processing circuit outputs access requests (containing among others a memory
address) at periodic start times (clock ticks) and the arbitration circuit sequences conflicts.
Once the memory can accept a request, the arbitration circuit accepts the first made request
first. If both processing circuits have made access requests simultaneously, the arbitration
circuit decides in which order the accesses are accepted.

In an embodiment the variable delay is realized by introducing a self-timed 
activity that repeatedly first receives a request via the arbitration circuit from one of the 
processing circuits and then performs the required memory access. This self-timed activity 
generates a third clock (timing) signal for accessing the memory, so that the memory 
accesses have variable phase shifts with respect to the processing clocks (accepting a 
memory access request as used herein only means that the self-timed circuit has copied the 
request in a buffer register). In this way, when all requests have to be serviced, the memory
speed requirement is reduced to: the performance of the memory should not be less than the
sum of the access rates of the processing circuits. Note that this only imposes a lower speed
requirement on memory with respect to prior art solutions if the access rates of the different 
processing circuits differ. 

By including the arbiter before the self-timed activity, delays introduced by the
arbiter do not contribute to the minimum memory repetition period. This reduces the speed
requirements imposed on the memory. 

If both processing circuits request to access the memory simultaneously and
the request of the fastest processing circuit is handled last, then this request is accepted after
the memory access time, which is less than the clock period of the clock of the fastest
processing circuit. During the time interval in which the access request from the fastest
processing circuit is being handled by the self-timed activity the next access request from the
fastest processing circuit can already occur. This second request is accepted with less delay
relative to the clock time at which the request is offered than the previous one. In subsequent
accesses the phase delay is reduced in successive steps, until either the delay is zero or the
slow processing unit requests an access. By the time that the latter occurs, the delay between
the clock and acceptance of a memory request in the fast processing unit has been reduced to
such an extent that the time left between accepting a request and the next clock tick is at least
the memory access time (and some timing overhead).

Accordingly, in an embodiment a single register in the self-timed activity is
used to capture the access information, without using a FIFO for buffering a number of
requests that may increase to two or more requests. Since the self-timed activity is behind the
arbiter, the register is shared for successively storing requests from both data processing
circuits.



The minimum memory repetition period does not need to be much shorter than
the time interval between successive requests from the fast processing circuit. If no request
from a processing circuit should be missed, the sum of the access request frequencies of both
processing circuits should be less than the inverse of the memory access time. When the
access frequency of one of the processing circuits is less than that of the other, the required
memory access speed is therefore less than twice that of the fast processing circuit. Typically,
when the access frequency of the slow data processing circuit is one tenth of that of the fast
data processing circuit, the memory speed needs to be only ten percent higher than the
speed of the fast processing circuit.

A data register may be provided for receiving read data from the memory in
response to read requests. When read requests are generated only at a low frequency the read
data needs to be refreshed at a low rate, so that it can be processed by one or more of the
processing circuits without special timing requirements. In particular, when only the second
data processing circuit produces read requests at a frequency lower than the request
frequency of the first data processing circuit it is ensured that the read data will be available
within a fixed delay for use by the second data processing circuit.

In an embodiment the read and write data widths differ, read data (for the
second data processing circuit) containing a multiple of write words (from the first data
processing circuit). Thus a high data rate can be realized with a low request rate from the
second data processing circuit, permitting a minimum memory repetition frequency that is
only slightly above the request frequency of the fast data processing circuit.

The memory can be made up of banks that are arranged at successive
geometrical positions along a row in an integrated circuit. In that case the wire delays will
significantly contribute to the access time, which consists of the sum of the memory access time
and the wire delays. The reduction of the access frequency by these wire delays can be
alleviated by performing the accesses to the different memory banks in a pipeline with
successive stages coupled to respective ones of the memory banks. Preferably, each memory
bank has a self-timed activity which repeatedly first receives an access request from its
predecessor in the pipeline and then passes this request on to its successor while performing
an access to its bank if this is required. In such an embodiment the performance of the
memory is limited, if at all, by the wire delays between two neighboring memory banks
instead of the delays of wires running along all successive banks.



These and other objects and advantageous aspects of the apparatus according 
to the invention will become apparent from the following figures and their description. 

Figure 1 shows a circuit with a memory and circuits for two processes; 

Figure 2 shows access period duration and delay as a function of time;

Figure 3 shows signals involved in the circuit of figure 1; 

Figure 4 shows part of a timing circuit; 

Figure 4a shows another part of a timing circuit; 

Figure 5 shows a memory architecture; and 

Figure 6 shows an alternative circuit with a memory. 



Figure 1 shows a circuit with a first data processing circuit 10a, a second data
processing circuit 10b, a first clock circuit 11a, a second clock circuit 11b, a selector circuit
12, a multiplexer 14, a synchronization circuit 15, a register 16, a memory 18 and a data
register 19. First clock circuit 11a is coupled to first data processing circuit 10a and selector
circuit 12. Second clock circuit 11b is coupled to second data processing circuit 10b and
selector circuit 12. First and second data processing circuits 10a,b have access request
information outputs coupled to inputs of multiplexer 14, which in turn has an output coupled
to an input of register 16. Selector circuit 12 has a selection output coupled to a control input
of multiplexer 14 and a timing control output coupled to synchronization circuit 15.
Synchronization circuit 15 has a timing output coupled to register 16 and memory 18. Data
register 19 has inputs coupled to memory 18 and an output coupled to second data processing
circuit 10b.

In a display driver application memory 18 stores image information, such as
pixel data, and second data processing circuit 10b is a display control circuit that controls pixel
content on a display screen (not shown) dependent on data read from memory 18 ("data
processing" as understood herein includes, but is not limited to, controlling information on a
display screen). In this application first data processing circuit 10a is for example a processor
that computes the pixel data, a receiver circuit or a camera processor. First data processing
circuit 10a writes the pixel data to memory 18 so that it can be read later by second
processing circuit 10b. The access request information from data processing circuits 10a,b
contains for example an address for addressing a location in memory 18, a control bit to
enable/disable access, a read/write control bit and optional data. However, it should be
realized that the invention is not limited to such requests. For example, memory 18 may contain
an address counter for updating an address for use with requests from one of the data
processing circuits 10a,b. In this case, no address needs to be supplied in the access request
information from that data processing circuit. Other information may be supplied by default.
In the extreme all access request information could be supplied by default as long as it is
indicated that the request comes from a particular data processing circuit 10a,b for which
defaults are available.

In operation, timing of the circuit is controlled by the combination of clock
circuits 11a,b and selector circuit 12. First and second data processing circuits 10a,b operate
in cycles determined by their respective clock circuits 11a,b. Each data processing circuit
10a,b is able to produce new access request information in each of its particular cycles.
Multiplexer 14 passes the access request information from a selected one of the data
processing circuits 10a,b to register 16, where the access request information is latched. (A
conventional multiplexing circuit may be used, such as a bus type circuit wherein one of the
inputs is conductively connected to the output). Register 16 passes the latched information to
memory 18, which accesses a memory location under control of the access request
information. In case of a write request accompanied with an address and data, memory 18
stores the data into the location addressed by the address. In case of a read request
accompanied by an address, memory 18 reads data from the addressed location and causes the
data to be latched in data register 19. Selector circuit 12 determines from which data
processing circuit 10a,b access request information is latched in register 16. Selector circuit 
12 triggers synchronization circuit 15 which determines when the access request information 
is latched and when a memory access cycle using the latched access request information is 
started. 

The cycle repetition rate of first and second data processing circuit 10a,b can
differ substantially, for example by a factor of ten. In an example first processing circuit 10a
has a cycle duration during which valid access request information is supplied of P1=100
nsec (F1=1/P1) and second processing circuit 10b has a corresponding duration of P2=1000
nsec (F2=1/P2). Memory 18 may be accessed with a variably selectable period between
successive accesses. The minimum duration of a memory access cycle Pm is the sum of the
memory access time (Macc), the wire delays (Wdel) and the delay of the control circuit
(Cdel). Therefore Pm=Macc+Wdel+Cdel. The maximum access frequency Fm of the memory
is the inverse of the minimum memory access cycle duration: Fm=1/Pm. This frequency must
exceed the sum of the frequencies of the first and second data processing circuit 10a,b:

Fm>F1+F2

With frequencies of F1=10 MHz and F2=1 MHz for example, a memory frequency of at least 11
MHz is required.

When it is known that first data processing circuit 10a does not issue new
access requests in all of its cycles, but only in a fraction k of its cycles (k=2/3 for example),
then the condition can be relaxed even further to

Fm>k*F1+F2

to account for the need to process only k*F1 access requests from first data processing
circuit 10a.
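
The frequency condition can be checked with a few lines of arithmetic. The following Python sketch is illustrative only: the helper name `min_memory_rate_hz` is hypothetical, and the numbers are the example values used in the text (F1=10 MHz, F2=1 MHz, k=2/3).

```python
def min_memory_rate_hz(f1_hz, f2_hz, k=1.0):
    """Lower bound on the memory access frequency Fm.

    Implements the condition Fm > k*F1 + F2, where k is the fraction of
    cycles in which the fast circuit actually issues a request (k=1 when
    it issues a request in every cycle).
    """
    return k * f1_hz + f2_hz


F1 = 10e6  # fast (first) data processing circuit, example value from the text
F2 = 1e6   # slow (second) data processing circuit, example value from the text

required = min_memory_rate_hz(F1, F2)           # every fast cycle issues a request
overhead = required / F1 - 1.0                  # extra speed relative to the fast circuit
relaxed = min_memory_rate_hz(F1, F2, k=2 / 3)   # only 2/3 of the fast cycles issue a request

print(f"Fm must exceed {required / 1e6:.2f} MHz ({overhead:.0%} above F1)")
print(f"With k=2/3, Fm must exceed {relaxed / 1e6:.2f} MHz")
```

With a slow circuit at one tenth of the fast rate the bound is only about ten percent above the fast request rate, which matches the figure given earlier in the description.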

Selector circuit 12 selects access request information from slow data
processing circuit 10b to be copied to register 16 as soon as possible after slow data
processing circuit 10b makes the access request information available. This means that
selection of access request information from fast data processing circuit 10a is delayed at this
time.

Figure 2 shows the delay D between the time that first data processing circuit 10a
makes access request information available and the time that the access request information
is copied into register 16, as a function of time. In addition the figure shows the actual
duration P between successive cycles initiated by synchronization circuit 15, to trigger
copying into register 16 and to trigger a memory access cycle. (It may be noted that delays
[...] been drawn).

It is seen that initially the delay D has a small value D0 and the repetition
period of memory access cycles P is equal to the duration T1 of a repetition cycle of first data
processing circuit 10a. At instants t2 where access request information from second data
processing circuit 10b is selected the delay D increases by an amount Tm equal to a
minimum length memory access cycle. Subsequently, the length of the repetition period of
memory access cycles P drops to the minimum length Tm for a number of access cycles. This
causes the delay D to decrease by the difference T1-Tm after each access cycle until the
original small delay D0 is reached. After that the repetition period P of the memory access
cycles is increased from Tm to the cycle duration T1 of first data processing circuit 10a.

It should be noted that the worst case delay D is less than the cycle duration of
first data processing circuit 10a in the sense that access request information that is available
after the initial delay D0 is still available after the additional delay D0+Tm, because the
access cycle starts with a delay D0 after the access control information becomes available
and because T1>Tm. The selection of the cycle frequencies Fm>F1+F2 ensures that the delay
D is reduced to D0 before a next cycle of second data processing circuit 10b starts and causes
the delay to increase. In this way it is ensured that there is no need to make first data
processing circuit 10a wait for access to memory, or for an additional buffer to buffer the
access request information from first data processing circuit 10a.
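
The behaviour of the delay D can be illustrated with a small simulation. This is a minimal sketch under stated assumptions, not the circuit itself: requests are served first-come-first-served, a tie is resolved in favour of the slow circuit (the worst case for the fast circuit), the initial delay D0 is taken as zero, and Tm=90 nsec is a hypothetical value chosen so that 1/Tm exceeds F1+F2=11 MHz. The function name `simulate_delays` and the parameter `slow_offset` are illustrative.

```python
def simulate_delays(t1, t2, tm, n_fast, slow_offset=0):
    """Return the delay D (request time to acceptance) of each fast request.

    t1, t2: request periods of the fast and slow circuit (same time unit).
    tm:     minimum memory access cycle duration.
    """
    FAST, SLOW = 1, 0  # SLOW sorts first, so it wins a simultaneous request
    horizon = n_fast * t1
    requests = [(i * t1, FAST) for i in range(n_fast)]
    requests += [(t, SLOW) for t in range(slow_offset, horizon, t2)]

    mem_free = 0  # earliest time at which the memory can start a new access
    delays = []
    for t_req, source in sorted(requests):
        start = max(t_req, mem_free)  # wait until the previous access has finished
        mem_free = start + tm
        if source == FAST:
            delays.append(start - t_req)
    return delays


# Example values: T1=100 nsec, T2=1000 nsec, hypothetical Tm=90 nsec.
# The slow request coincides with the third fast request, as in the worst case.
delays = simulate_delays(t1=100, t2=1000, tm=90, n_fast=12, slow_offset=200)
print(delays)
```

In this sketch the delay jumps by Tm when the slow request is inserted and then shrinks by T1-Tm per fast cycle. It never reaches T1, so each fast request is accepted while it is still valid.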

It should also be noted that, as will be described, memory access may be
pipelined. In this case the duration Pm does not correspond to the full time needed for
memory access, but only to the duration of processing the request in a single pipeline stage.
When Pm is determined by the initial pipeline stage (or when this is the only stage) it
includes the duration for processing in this stage, plus the memory access time, plus wire
delays.

Figure 3 shows timing of the various cycles. Traces CLK1 and CLK2 show 
clock signals from first and second clock circuits 11a,b, traces ACC1, ACC2 show access
request events. Trace SEL shows a selection signal from selector circuit 12 and trace CLK3 
shows memory cycle trigger pulses. 

Initially a situation with small delay D0 of figure 2 holds. Access request
information from first data processing circuit 10a is selected. In response to the first two
clock pulses in CLK1 pulses in CLK3 are almost immediately generated to load the access
request information into register 16 and to subsequently process the access request under
control of the loaded access request information. Thus the delay between the pulses in CLK3
is equal to the cycle duration T1 of CLK1 at this time.

The start of a third clock pulse 30 in CLK1 arrives simultaneously with the
start of a clock pulse in CLK2 (this is a worst case situation; the clock pulses need not
coincide). Now the selection circuit selects the access information from second data
processing circuit 10b and a pulse in CLK3 is almost immediately generated in response to
the pulse in CLK2 to load the access request information into register 16 [...]
circuit 10a and, as soon as allowable in view of the speed of [...]
CLK3 cycle. It should be [...] delay [...]
until first data processing circuit 10a changes the access request information ACC1 in
correspondence with the next pulse 34. In response to the next [...] as soon as a new [...]
interval Tm and so on. Thus, the delay between pulses in CLK1 and corresponding pulses in
CLK3 is gradually reduced.

[...] ACC1 should [...] However, the time interval in which the access [...] is
very small (for example less than 0.5 nsec), so that this does not significantly affect the
minimum allowable delay at a cycle frequency of 10 MHz.

[...] CLK2. If the start of the clock pulse in CLK2 precedes the start of the clock pulse of first
processing circuit 10a, the access request from second processing circuit 10b is [...]
continues if need be into the next [...]
requests from first processing circuit 10a in the next clock cycle need not be delayed at
all, or more time is left in the next clock cycle than shown in figure 3.
It will be appreciated that in an embodiment data processing circuits 10a,b
need not request access in each of their clock cycles. If so, the clock signals applied to
selector circuit 12 may be disabled in those cycles in which no request is made. Thus, the
increase in delay D is reduced more quickly in case of a disabled access request from first
data processing circuit 10a, or an increase in the delay D is prevented in case of a disabled
access from second data processing circuit 10b.

No data needs to be returned from memory 18 when both data processing
circuits 10a,b only write data. Data register 19 is provided for the case that slow (second)
data processing circuit 10b generates read requests. In case of a read access memory 18 sends
the data that has been read and a load signal when read data is available to data register 19.
The circuit has the effect that read data is always available at least from a predetermined time
Tm+Am after the corresponding access request, allowing for a delay Am to read the data and
a maximum delay Tm to finish access for an access cycle previous to the read cycle. It may
be noted that the duration of a (pipeline step of a) memory read cycle may differ from a
memory write cycle. In this case the duration of the memory read cycle should be so short
that the delay until the access request information ACC1 is changed is longer than the
memory read cycle. Because the clock of second data processing circuit 10b is much slower
the read data will be loaded only after it has been loaded into data register 19.

When access requests from data processing circuits 10a,b are synchronized the
read data is available in a predetermined time interval Da-Db after the access request. In this
case data register 19 may be omitted, or timed from second data processing circuit 10b.

When both data processing circuits can issue read requests, a data register 19 
is preferably provided for each, and loaded according to the source of the read request (for 
example under control of a delayed SEL signal). 

Figure 4 shows an embodiment of a partial circuit of the selector circuit. The
circuit has inputs for coupling to clock circuits 11a,b (not shown) and a handshake interface
REQ, ACK for coupling to synchronization circuit 15 (not shown). The circuit contains an
asynchronous arbiter 40 (mutual exclusion element), a pair of clock flip-flops 41a,b, a pair of
AND gates 42a,b, a pair of asymmetric Muller C-elements 44a,b and an OR gate 46. Arbiter
40 is of a type known per se, which raises the output corresponding to an input where the
input signal is raised, with the exclusion that at most one output is kept high at a time. Muller
C-elements 44a,b are also known per se, and are of a type that raise their output signal if all
their input signals are logic high and lower the output signal when the input not marked with
a + becomes low.
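
The behaviour of these two components can be summarized in a short behavioural model. The Python sketch below is illustrative only: the class names `MutualExclusion` and `AsymmetricCElement` are hypothetical, the model is untimed, and a simultaneous request is resolved in favour of input 0 purely by convention (a real arbiter resolves such ties internally).

```python
class MutualExclusion:
    """Two-input arbiter: raises the output of a requesting input,
    but keeps at most one output high at a time."""

    def __init__(self):
        self.grant = None  # index of the input currently granted, or None

    def update(self, req0, req1):
        reqs = (req0, req1)
        # Release the grant once the granted input has been lowered.
        if self.grant is not None and not reqs[self.grant]:
            self.grant = None
        # Grant a waiting request (tie goes to input 0 in this model only).
        if self.grant is None:
            for index, req in enumerate(reqs):
                if req:
                    self.grant = index
                    break
        return tuple(self.grant == i for i in range(2))


class AsymmetricCElement:
    """Output rises when all inputs are high and falls when the input
    not marked '+' goes low; otherwise the output holds its value."""

    def __init__(self):
        self.out = False

    def update(self, plain, plus):
        if plain and plus:
            self.out = True
        elif not plain:
            self.out = False
        return self.out


arbiter = MutualExclusion()
print(arbiter.update(True, True))   # only one of the two requests is granted
print(arbiter.update(False, True))  # the waiting request is granted after release

c_element = AsymmetricCElement()
print(c_element.update(True, True))   # rises: all inputs high
print(c_element.update(True, False))  # holds: the '+' input alone cannot lower it
```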
Outputs of clock circuits 11a,b (not shown) are coupled to [...]
clock flip-flops 41a,b, which have outputs coupled to inputs of arbiter 40. [...]
outputs coupled to first inputs of AND gates 42a,b respectively; AND [...]
44a,b. Muller C-elements 44a,b have outputs coupled to inverting inputs of AND gates 42a,b
and to [...] inputs of clock flip-flops [...]. The ACK input of [...]
coupled to [...] inputs of Muller C-elements 44a,b. The output of OR gate 46 is
coupled to the REQ output of the asynchronous interface. The outputs of AND gates 42a,b
are used to control multiplexer 14 (not shown).

In operation, when one of the clock [...] its output signal, the
output of the corresponding flip-flop 41a,b goes high. As soon as at least one of [...]
the ACK signal [...] cycle. Note that as soon as the arbiter
has made its output to AND gate 42a low, it can immediately accept a waiting request from
processing circuit 10b. However, as long as the previous memory handshake [...]
AND gate 42b to become high. In this way [...]
preventing two memory accesses from different sources from overlapping.

In an embodiment synchronization circuit 15 is of a type known per se,
which has handshake interfaces to selector circuit [...]
[...] synchronization circuit 15 acknowledges [...] and causes
register 16 to load access request data. When the data is loaded and the request has been
[...] synchronization circuit 15 deasserts the acknowledge. Once the access request
[...] a handshake with memory 18.
Once this handshake has been completed synchronization circuit 15 is ready to acknowledge
a next request from selector circuit 12.

Figure 4a shows an embodiment of synchronization
circuit 15, which uses handshake signals. In this embodiment synchronization circuit 15
contains a repeater circuit 150, a sequencing circuit 152 and a Muller C element 154.
Repeater circuits 150 and sequencing circuits 152 are standard asynchronous circuit
components for which implementations are known per se. For the sake of clarity these
circuit components have been drawn symbolically. A request input from selector circuit 12
(not shown) and a first request output from sequencing circuit 152 are coupled to inputs of
Muller C-element 154. Muller C element 154 has an output coupled to an acknowledge input
of selector circuit 12, a first acknowledge input of sequencing circuit 152 and a clock input of
register 16. A second request output and acknowledge input of sequencing circuit 152 are
coupled to memory 18 (not shown). A second request output and acknowledge input of
sequencing circuit 152 are coupled to repeater circuit 150.

In operation a request signal from selector circuit 12 is handled when 
sequencing circuit 152 also outputs a request signal. In this case the request information is
clocked into register 16 and the requests are acknowledged to selector circuit 12 and
sequencing circuit 152. In response sequencing circuit 152 sends a request signal to memory
18, which then performs a memory access with the access information stored in register 16.
As soon as the memory access is completed, the memory sends an acknowledge signal back. 
The sequencer 152 then sends an acknowledge signal to repeater 150, which responds with a 
request signal that, in turn is passed to Muller C element 154. 

When selector circuit 12 sends a request signal before the sequencing circuit 
has sent a new request signal, Muller C-element 154 does not respond until sequencing 
circuit has sent a new request signal. Thus, clocking of register 16 and sending a request to
memory 18 is delayed until at least a minimum memory access period has passed since the
start of the previous memory access. 

In many applications, such as for instance mobile display drivers, the memory
consists of several memory banks. The memory banks are then often arranged sequentially 
over a long geographical distance, for example at locations corresponding to different pixel 
ranges on a display. These long geographical distances lead to large wire delays (Wdel) and 
consequently to low memory frequency Fm. This problem can be circumvented by pipelining 
the memory access requests. 

Figure 5 shows an example of such a memory circuit for use in the circuit of
figure 1. The circuit contains a number of memory banks 52a-d and a number of
synchronization circuits 50a-d. The synchronization circuits 50a-d are arranged as stages in a
pipeline, which passes the access request information from register 16. The first stage in this
pipeline has a handshake interface to synchronization circuit 15. In addition handshake
interfaces are provided between pairs of successive stages in the pipeline. The
synchronization circuits have outputs coupled to memory banks 52a-d.

PHNL030688EPP

In operation synchronization circuits 50a-d each repeatedly first receive and
latch access request information from their left neighbor, then apply this information to their
associated memory bank while passing the information [...]
receive the access request information is acknowledged as soon as possible once the access
request information has been stored, after which the access request information (e.g. an
address, r/w control and optionally write data) is applied to the corresponding memory bank.
A new request is accepted only if the [...]
the information is passed to its right neighbor.

It will be appreciated that the [...]
length Tm of the memory access cycle by reducing the effect of wire delays between the two
communicating circuits, permitting a high memory frequency Fm. This in turn allows a high
cycle frequency for data processing circuits 10a,b. It will also be appreciated that other forms
of pipelining may be used [...]
a sufficiently fast cycle time without pipelining.
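
The effect of pipelining on the minimum cycle duration can be illustrated numerically. The sketch below rests on assumptions that are not taken from the description: wire delay is modelled as growing linearly with the number of banks a signal has to pass, and the values of Macc, Cdel and the per-hop wire delay are hypothetical. It only shows how Pm=Macc+Wdel+Cdel changes when Wdel covers a single hop instead of the whole row of banks.

```python
def cycle_unpipelined(macc, cdel, wire_per_hop, n_banks):
    """Pm when a request must travel along all banks: Macc + Wdel + Cdel,
    with Wdel modelled (assumption) as wire_per_hop * n_banks."""
    return macc + wire_per_hop * n_banks + cdel


def cycle_pipelined(macc, cdel, wire_per_hop):
    """Pm of one pipeline stage: only the wire delay to the neighbouring
    stage contributes."""
    return macc + wire_per_hop + cdel


# Hypothetical values in nsec, chosen only for illustration.
MACC, CDEL, WIRE_PER_HOP, N_BANKS = 40, 10, 15, 4

pm_flat = cycle_unpipelined(MACC, CDEL, WIRE_PER_HOP, N_BANKS)
pm_pipe = cycle_pipelined(MACC, CDEL, WIRE_PER_HOP)

for label, pm in (("unpipelined", pm_flat), ("pipelined", pm_pipe)):
    print(f"{label}: Pm = {pm} nsec, Fm = {1e3 / pm:.2f} MHz")
```

With these hypothetical numbers the unpipelined memory would fall short of the 11 MHz needed for F1=10 MHz and F2=1 MHz, whereas a single pipeline stage would meet it.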

In an embodiment, read data from banks 52a-d is output in parallel in response
to a read request. In this embodiment the read data from each bank is preferably latched in a
respective corresponding data register (not shown) when the relevant bank has produced the
data. In this way read words from the memory are wider than write words, which is
useful for example for displays where very wide words (e.g. image lines) are needed at a low
frequency.

Although the circuit has been described in terms of handshaking interfaces it
will be appreciated that instead of the handshaking interfaces one-sided trigger interfaces
may be used. For example synchronization circuit 15 could be arranged to trigger a pulse of
minimum duration upon reception of a request and be ready to accept a new request at the
end of the pulse. The pulse may be used to trigger memory 18 and register 16. When it can be
guaranteed that the triggered circuits respond sufficiently quickly to be ready when the next
trigger pulse arrives no handshake is needed. A handshake, however, has the advantage that it
is compositional in that the [...]
the speeds of the other submodules. Similarly, a handshake interface towards clock circuits
11a,b may be used instead of the pulse interface described in the context of the figures. In
this embodiment clock circuits 11a,b delay the next pulse of the clock by an amount
sufficient for the relevant data processing circuit 10a,b to produce the next access request
information and start the next cycle when the request is acknowledged. Thus, the clock
frequency that clock circuits 11a,b apply to data processing circuits 10a,b may be adjusted.
However, it should be noted that in this embodiment the first clock circuit 11a for the fast
(first) data processing circuit 10a only adapts its frequency. It does not need to make
a sudden large phase jump of the size of a memory cycle when a memory cycle is inserted for
the second data processing circuit 10b. Similarly, instead of asynchronous interfaces
synchronous interfaces may be used, for example by deriving the clock signals of clock
circuits 11a,b from a common clock source, e.g. by dividing a higher frequency clock by
different frequency division ratios, or by phase locking one of the clock circuits to the other.

In this case, control pulses for register 16 and memory 18 may also be derived
from a clock that is synchronized to the other clock circuits. For example, if clocks 11a,b run
synchronized at frequencies N1*F0 and N2*F0 respectively, then a clock for register 16 could
be made to run at N1*F0 when there is no delay and at (N1+N2)*F0 upon receiving an
access request from second data processing circuit 10b until the delay has been caught up.
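
As an illustrative calculation, the number of register cycles needed to catch up can be estimated as follows. The sketch assumes that handling the request of second data processing circuit 10b costs exactly one register cycle at the higher rate (N1+N2)*F0; this assumption, the function name catch_up_cycles and the default F0 = 1 are hypothetical and not taken from the description.

```python
from fractions import Fraction
from math import ceil


def catch_up_cycles(n1, n2, f0=Fraction(1)):
    """Register cycles at (n1+n2)*f0 needed to recover one inserted cycle.

    n1 and n2 are positive integers; f0 is the common base frequency.
    """
    slow_period = 1 / (n1 * f0)          # period of clock 11a and of the undelayed register clock
    fast_period = 1 / ((n1 + n2) * f0)   # register clock period while catching up
    delay = fast_period                  # assumption: the inserted access costs one fast cycle
    gain_per_cycle = slow_period - fast_period   # delay recovered by each fast cycle
    return ceil(delay / gain_per_cycle)


cycles = catch_up_cycles(4, 1)
```

With N1 = 4 and N2 = 1 the sketch yields four fast cycles. Together with the inserted cycle this gives five cycles at 5*F0, which span exactly four periods of clock 11a at 4*F0.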

Instead of dividers or locked clocks one may also use a clock multiplexer for
providing a clock to memory 18, which passes a signal from a separate clock of memory 18
or from first clock circuit 11a of first processing circuit 10a. In this embodiment the separate
clock is started upon an access request from second processing circuit 10b and runs at a
frequency above that of first clock circuit 11a. The signal from the separate clock is passed
after granting an access request from second processing circuit 10b at least until the separate clock
has gained so much on the clock signal of first clock circuit 11a that it starts in the early part
of the period of the first clock circuit 11a which is more than a minimum memory access
period before the end of the period of first clock circuit 11a.

It will be appreciated that the architecture allows a single register 16 to be
used to buffer all information between data processing circuits 10a,b and memory, but of
course more registers may be used.

Figure 6 illustrates an embodiment in which a register 60a is used between the
first data processing circuit 10a and multiplexer 14 instead of register 16. Register 60a can be
loaded at substantially the same time as register 16 of figure 1 (however, loading may be
omitted at time points when access is accepted from the second data processing circuit 10b).
No register is necessary for second data processing circuit 10b when T2, the duration for which
access request information from second data processing circuit 10b remains valid, is longer than
2*Tm, the worst case delay until the access request information has been processed by memory 18.

Although the invention has been described for access request information that
is supplied in parallel from a data processing circuit, it will be appreciated that, without
deviating from the invention, this information may be supplied partly or wholly serially as
long as this does not lead to violation of the timing constraint.

Similarly, it should be appreciated that more than two data processing circuits
10a,b, each with their own output for periodically producing access request information at its
own frequency, may be coupled to register 16 via multiplexer 14. For example, several fast
data processing circuits and one slow data processing circuit may be used if the sum of the
access frequencies does not exceed the memory access frequency. In another example one
fast data processing circuit and several slow data processing circuits may be used.

In general, if there are N data processing circuits and if N-1 times the
minimum memory cycle duration fits into the cycle duration of any one of the processors, the
circuit ensures that access request information will be captured by the register before the end of
the cycle duration even if another processing circuit is granted access first, provided that the
sum of the frequencies is less than the inverse of the minimum memory cycle length.
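
The two conditions of this paragraph can be written out as a simple check. This is a minimal sketch, assuming that all cycle durations and the minimum memory cycle duration are given in the same time unit; the function name access_is_guaranteed and the example values are hypothetical.

```python
def access_is_guaranteed(periods, t_mem):
    """Check both conditions for processing circuits with cycle durations `periods`."""
    n = len(periods)
    # N-1 minimum memory cycles must fit into the cycle duration of every processing circuit.
    fits = all((n - 1) * t_mem <= t for t in periods)
    # The sum of the request frequencies must be less than the inverse of the memory cycle.
    rate_ok = sum(1 / t for t in periods) < 1 / t_mem
    return fits and rate_ok


ok_two = access_is_guaranteed([3.0, 10.0], 2.0)          # both conditions hold
too_fast = access_is_guaranteed([3.0, 4.0], 2.0)         # summed frequency is too high
too_many = access_is_guaranteed([3.0, 30.0, 30.0], 2.0)  # two memory cycles do not fit in 3.0
```

In the first example one fast circuit (cycle 3.0) and one slow circuit (cycle 10.0) share a memory with a minimum cycle of 2.0. The second example fails only the frequency condition, and the third fails only the condition that N-1 memory cycles fit into the shortest cycle.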

1. A data processing apparatus, comprising

- a first and second data processing circuit (10a,b), each with an output for [...]
respective access requests each during a respective validity duration interval;

- a multiplexing circuit (14) with inputs coupled to the outputs of the first and
second data processing circuits (10a,b);

- [...] from an output of the multiplexing circuit (14) [...]
memory period following acceptance of a preceding access request;

- a timing circuit (11a,b, 12, 15) [...] to the first and second data processing
circuits (10a,b) and the memory circuit (16, 18), and arranged to time operation of the first
data processing circuit (10a) [...] periodically with a longer period [...],
the timing circuit (11a,b, 12, 15) being arranged to [...]
validity duration intervals, so that the position is delayed within the validity duration interval
[...] first data processing circuit with subsequent periods of validity [...]

2. A data processing circuit according to Claim 1, wherein the timing circuit
comprises first and second clocking circuits (11a,b) coupled to clock inputs of the first and
second data processing circuit (10a,b) respectively, whereby the access requests, if made, are
[...] by the first and second data processing circuit (10a,b) at a [...]
of the first and second clocking circuit (11a,b) respectively, the sum of the first and second
frequency being smaller than the inverse of the minimum memory repetition period.

3. A data processing circuit according to Claim 2, wherein the timing circuit
comprises an asynchronous arbiter circuit (40) with inputs coupled to the first and second
clocking circuit (11a,b) and an output coupled to a control input of the multiplexing circuit
(14), the arbiter circuit (40) being arranged to control from which of the first and second data
processing circuit (10a,b) the multiplexing circuit (14) will pass the access request, the arbiter
circuit (40) selecting among the data processing circuits (10a,b) on a first come first served
basis of transitions in clock signals of the first and second data processing circuit (10a,b).

4. A data processing circuit according to Claim 3, comprising an asynchronous
timer circuit (15) with a trigger input coupled to the asynchronous arbiter circuit (40) and
arranged to generate a timing signal for accessing the memory (18), the asynchronous timer
circuit (15) triggering a memory access cycle each time when the asynchronous arbiter circuit
(40) selects a request and a previous minimum memory repetition period has finished.

5. A data processing circuit according to Claim 1, the memory circuit comprising
a register (16) and a storage unit (18), the register (16) coupled between the first data
processing circuit (10a) and the storage unit (18), for latching access request information
from at least the first data processing circuit (10a) for use by the storage unit (18) under
control of the timing circuit (11a,b, 12, 15), upon the delay determined by the timing circuit
(11a,b, 12, 15).

6. A data processing circuit according to Claim 1, wherein the memory circuit
(12, 18) comprises a series of successively coupled pipeline stages (50a-d), for executing
successive steps in response to an access request, the minimum memory repetition period
corresponding to a time interval needed by one of the pipeline stages to execute one of the
steps.

7. A data processing circuit according to Claim 6, wherein the memory circuit
(16, 18) comprises memory banks (52a-d), each coupled to a respective one of the pipeline
stages (50a-d), for processing each of the requests successively in different ones of the banks
(52a-d).

8. A data processing circuit according to [...] are
arranged at successive positions along a spatially [...] row on an integrated circuit, with read data
outputs for outputting data read in response to read requests among the requests at successive
positions along the row, the second data processing circuit (10b) comprising display driver
circuits, coupled to the outputs.

9. A data processing circuit according to Claim 1, comprising [...]
coupled between the memory circuit (16, 18) and the second data processing circuit [...]
request and for supplying the read data to [...]
handling of access requests of the first data processing circuit (10a).

10. A data processing circuit according to [...], wherein [...]
comprises a plurality of banks (52a-d) with a first data word length, write requests
among the requests comprising bank selection information and write data of the first data [...]
plurality of the banks (52a-d) in parallel in response to each read [...]

11. A data processing circuit according to Claim 1, wherein the second data
processing circuit (10b) comprises display drivers for processing read data from the memory
by driving a content of a display device dependent on the read data.



12. A method of processing data, the method comprising

- providing a memory circuit (16, 18) capable of accepting successive access
requests each time after a [...]

- [...] data processing circuits (10a,b), access [...] minimum
memory repetition period;

- time-multiplexing access requests from the first and second output to the
memory circuit (12, 16);

- controlling a delay between the start of the validity duration in which
[...] outputs the access request [...]
periods of validity, the delay being reduced in successive steps during application of
successive access requests from the first output, until the delay in a particular period of
validity lies at least one minimum memory repetition period before the end of the particular
period of validity;

- subsequently causing acceptance of at least one of the access requests from the
second output, increasing the delay before a next access request of the first data processing
circuit is accepted within the period of validity in which this next access request remains at
the output.

ABSTRACT:

A data processing apparatus contains several processing circuits each
operating under control of its own periodic clock signal, so that the clock signals may have
different frequencies and/or can be autonomous. The several processing circuits each have an
output for outputting memory access requests, which remain at the output for a validity
duration interval defined by the clock signal of the particular processor. A multiplexing
circuit multiplexes the access requests to a memory. The memory needs a minimum memory
repetition period before it can accept an access request following acceptance of a preceding
access request. The clock periods of the processing circuits are longer than the minimum
memory repetition period. A timing circuit selects acceptance time points at which each
particular access request from a first data processing circuit is accepted. The time point at
which the particular request is accepted is always within the validity duration interval in
which the particular access request is made. The timing circuit varies the position of the
acceptance time points within the validity duration intervals, so that the position is delayed to
make room for previously accepting an access request from another processor. The position
is subsequently moved back toward a start of the validity duration interval in successive steps
during application of successive access requests from the first data processing circuit.